17 research outputs found

    Analysis and visualisation of RDF resources in Ondex

    An increasing number of biomedical resources provide their information on the Semantic Web and this creates the basis for a distributed knowledge base which has the potential to advance biomedical research [1]. This potential, however, cannot be realized until researchers from the life sciences can interact with information in the Semantic Web. In particular, there is a need for tools that provide data reduction, visualization and interactive analysis capabilities.
Ondex is a data integration and visualization platform developed to support Systems Biology research [2]. At its core is a data model based on two main principles: first, all information can be represented as a graph and, second, all elements of the graph can be annotated with ontologies. This data model conforms to the Semantic Web framework, in particular to RDF, and Ondex is therefore well positioned as a platform that can exploit the Semantic Web.
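The two principles of the data model can be illustrated with a minimal sketch. The class and field names below (Concept, Relation, Graph, concept_class, relation_type) are hypothetical and chosen for illustration only; they are not the Ondex API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A graph node annotated with an ontology class (hypothetical names)."""
    id: str
    concept_class: str  # e.g. "Protein"


@dataclass(frozen=True)
class Relation:
    """A directed edge annotated with an ontology relation type."""
    source: str
    target: str
    relation_type: str


@dataclass
class Graph:
    concepts: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_concept(self, cid, concept_class):
        self.concepts[cid] = Concept(cid, concept_class)

    def add_relation(self, source, target, relation_type):
        # Both endpoints must already exist as concepts.
        if source not in self.concepts or target not in self.concepts:
            raise KeyError("both endpoints must be added before the relation")
        self.relations.append(Relation(source, target, relation_type))

    def filter_by_class(self, allowed):
        """Data reduction: keep concepts whose class is in `allowed`,
        plus the relations whose two endpoints both survive."""
        sub = Graph()
        for concept in self.concepts.values():
            if concept.concept_class in allowed:
                sub.concepts[concept.id] = concept
        sub.relations = [
            r for r in self.relations
            if r.source in sub.concepts and r.target in sub.concepts
        ]
        return sub


# Illustrative data only; identifiers are placeholders.
g = Graph()
g.add_concept("P1", "Protein")
g.add_concept("P2", "Protein")
g.add_concept("T1", "BiologicalProcess")
g.add_relation("P1", "P2", "interacts_with")
g.add_relation("P1", "T1", "participates_in")
proteins_only = g.filter_by_class({"Protein"})
```

Because every element carries an ontological annotation, a reduction such as `filter_by_class` needs no knowledge of the underlying data source; it only inspects the annotations.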
The Ondex system offers a range of features and analysis methods of potential value to Semantic Web users, including:
-	An interactive graph visualization interface (Ondex user client), which provides data reduction and representation methods that leverage the ontological annotation.
-	A suite of importers from a variety of data sources to Ondex (http://ondex.org/formats.html).
-	A collection of plug-ins which implement graph analysis, graph transformation and graph-matching functions.
-	An integration toolkit (Ondex Integrator), which allows users to compose workflows from these modular components.
-	Web-service versions of all importers and plug-ins, which can be integrated in other tools such as Taverna [3].
The developments presented in this demo make this functionality interoperable with the Semantic Web framework. In particular, we have developed an interactive, SPARQL-based importer that allows the query-driven construction of datasets, bringing together information from different RDF data resources into Ondex.
These datasets can then be further refined, analysed and annotated, both interactively using the Ondex user client and via user-defined workflows. The results of these analyses can be exported as RDF, which can be used to enrich existing knowledge bases or to provide application-specific views of the data. Both the importer and the exporter focus only on the subset of the Ondex and RDF data models that the two data representations share [4].
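The idea of importing RDF into a graph and exporting it again can be sketched as follows. This is an illustration under stated assumptions, not the Ondex importer or exporter: plain Python tuples stand in for the rows returned by a SPARQL query, literals are assumed to be strings wrapped in double quotes, and the `ex:` identifiers and function names are hypothetical.

```python
def triples_to_graph(triples):
    """Map (subject, predicate, object) triples onto a simple graph.

    Triples with a literal object become node attributes; all other
    triples become edges between two nodes.
    """
    nodes, edges = {}, []
    for s, p, o in triples:
        nodes.setdefault(s, {})
        if len(o) >= 2 and o.startswith('"') and o.endswith('"'):
            # Literal value: store it as an attribute of the subject node.
            nodes[s].setdefault(p, []).append(o[1:-1])
        else:
            # Resource: make sure the target node exists, then add an edge.
            nodes.setdefault(o, {})
            edges.append((s, p, o))
    return nodes, edges


def graph_to_triples(nodes, edges):
    """Export the graph back to triples (inverse of triples_to_graph)."""
    triples = list(edges)
    for node, attrs in nodes.items():
        for predicate, values in attrs.items():
            for value in values:
                triples.append((node, predicate, f'"{value}"'))
    return triples


# Placeholder data standing in for query results from several RDF sources.
query_results = [
    ("ex:P1", "rdfs:label", '"protein one"'),
    ("ex:P1", "ex:interactsWith", "ex:P2"),
    ("ex:P2", "rdfs:label", '"protein two"'),
]
nodes, edges = triples_to_graph(query_results)
round_trip = graph_to_triples(nodes, edges)
```

The sketch covers only a small shared subset of the two models (resources, literals and binary relations); anything outside that subset, such as blank nodes or typed literals, is deliberately ignored here.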
In this demo we will show how Ondex can be used to query, analyse and visualize Semantic Web knowledge bases. In particular, we will present real use cases focused on, but not limited to, resources relevant to plant biology.
We believe that Ondex can be a valid contribution to the adoption of the Semantic Web in Systems Biology research and in biomedical investigation more generally. We welcome feedback on our current import/export prototype and suggestions for the advancement of Ondex for the Semantic Web.

References

1.	Ruttenberg, A. et al.: Advancing translational research with the Semantic Web. BMC Bioinformatics 8 (Suppl. 3): S2 (2007).
2.	Köhler, J., Baumbach, J., Taubert, J., Specht, M., Skusa, A., Ruegg, A., Rawlings, C., Verrier, P., Philippi, S.: Graph-based analysis and visualization of experimental results with Ondex. Bioinformatics 22 (11): 1383-1390 (2006).
3.	Rawlings, C.: Semantic Data Integration for Systems Biology Research. Technology Track at ISMB’09, http://www.iscb.org/uploaded/css/36/11846.pdf (2009).
4.	Splendiani, A. et al.: Ondex semantic definition (Web document). http://ondex.svn.sourceforge.net/viewvc/ondex/trunk/doc/semantics/ (2009).

    Automating the gathering of relevant information from biomedical text

    Database curators increasingly rely on literature-mining techniques to help them gather and make use of the knowledge encoded in text documents. This thesis investigates how an assisted annotation process can help, and explores the hypothesis that only in full-text publications can a system tell relevant and irrelevant facts apart by studying their frequency. A semi-automatic annotation process was developed for a particular database, the Nuclear Protein Database (NPD), based on a set of full-text articles newly annotated with regard to subnuclear protein localisation, along with eight lexicons. The annotation process is carried out online: it retrieves relevant documents (abstracts and full-text papers) and highlights sentences of interest in them. The process also offers a summary table of the facts found, clustered by type of information. Each method involved in each step of the tool is evaluated using cross-validation results on the training data as well as test-set results. The performance of the final tool, called the “NPD Curator System Interface”, is estimated empirically in an experiment where the NPD curator updates the database with pieces of information found relevant in 31 publications using the interface. A final experiment complements our main methodology by showing its extensibility to retrieving information on protein function rather than localisation. I argue that the general methods, the results they produced and the discussions they engendered are useful for any subsequent attempt to generate semi-automatic database annotation processes. The annotated corpora, gazetteers, methods and tool are fully available on request from the author ([email protected]).
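    The frequency hypothesis can be illustrated with a small sketch. The abstract does not describe the thesis's actual methods, so everything below is an assumption made for illustration: a "fact" is taken to be a protein name and a compartment name matched from two small lexicons within one sentence, and a fact counts as relevant once it recurs at least `min_count` times in the same document. The protein names are placeholders.

```python
import re
from collections import Counter


def fact_frequencies(sentences, proteins, compartments):
    """Count how many sentences mention each (protein, compartment) pair."""
    counts = Counter()
    for sentence in sentences:
        # Lower-cased word tokens of the sentence, punctuation stripped.
        tokens = set(re.findall(r"[a-z0-9-]+", sentence.lower()))
        for protein in proteins:
            for compartment in compartments:
                if protein.lower() in tokens and compartment.lower() in tokens:
                    counts[(protein, compartment)] += 1
    return counts


def relevant_facts(counts, min_count=2):
    """Keep facts mentioned in at least `min_count` sentences."""
    return [fact for fact, n in counts.most_common() if n >= min_count]


# Placeholder full-text sentences; ProtA and ProtB are invented names.
full_text = [
    "ProtA accumulates in the nucleolus.",
    "We confirmed that ProtA is found in the nucleolus after stress.",
    "ProtB was once reported in speckles.",
]
counts = fact_frequencies(full_text, ["ProtA", "ProtB"], ["nucleolus", "speckles"])
relevant = relevant_facts(counts)
```

    In an abstract each fact is typically stated once, so a frequency threshold of this kind has nothing to discriminate on; a full text gives repeated mentions that the threshold can use.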

    Loudness of impulsive sounds: perception, measurements and models [La sonie des sons impulsionnels (perception, mesures et modèles)]

    AIX-MARSEILLE2-BU Sci.Luminy (130552106) / Sudoc, France

    Evaluation of height and weight monitoring by physicians during middle childhood [Evaluation de la surveillance staturo-pondérale par les médecins au cours de la deuxième enfance]

    The prevalence of obesity has risen steadily in recent years, reaching 16% in 2000 according to ANAES. Early screening based on the body mass index (BMI) and the growth curve, both included in every child health record booklet (carnet de santé) since 1995, is recommended to try to curb this increase. But are they used in everyday medical practice? From a retrospective study of 101 health record booklets of children aged 5 to 7, collected at the paediatric emergency department in Nantes, we determined that the corpulence (BMI) curve had not been plotted in 67.33% of cases, and that the body mass index had not been calculated at the mandatory examinations in more than 75% of cases. Conversely, two thirds of parents know of the BMI, but not of its curve. There is therefore still a long way to go towards effective prevention, which may need to involve serious training of health professionals, greater recognition of preventive medical care, a reorganisation of the health record booklet, and parental involvement.
    NANTES-BU Médecine pharmacie (441092101) / Sudoc; PARIS-BIUM (751062103) / Sudoc, France
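    For reference, the body mass index discussed above is the weight in kilograms divided by the square of the height in metres. The short sketch below computes it; the function name and the example measurements are hypothetical and do not come from the study.

```python
def body_mass_index(weight_kg, height_m):
    """Return BMI in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical measurements for a child; not data from the study.
bmi = body_mass_index(20.0, 1.15)
```

    In children the resulting value is not read against a fixed threshold; it is plotted against age on the reference curve, which is why the curve matters as much as the calculation.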

    Performance modelling with UML and stochastic process algebras

    We describe a software toolset which allows UML modellers to annotate their models with performance information. An equivalent performance model is extracted from the UML, solved, and the results reflected back to the UML level. Used in this way, our toolset gives a high-level approach to software performance modelling where the benefits of the performance modelling process are achieved without significant additional notational burden.

    Enhancing data integration with text analysis to find proteins implicated in plant stress response

    High-throughput genomic studies can identify large numbers of potential candidate genes, which must be interpreted and filtered by investigators to select the best ones for further analysis. Prioritization is generally based on evidence that supports the role of a gene product in the biological process being investigated. The two most important bodies of information providing such evidence are bioinformatics databases and the scientific literature. In this paper we present an extension to the Ondex data integration framework that uses text mining techniques over Medline abstracts as a method for accessing both these bodies of evidence in a consistent way. In an example use case, we apply our method to create a knowledge base of Arabidopsis proteins implicated in plant stress response and use various scoring metrics to identify key protein-stress associations. In conclusion, we show that the additional text mining features are able to highlight proteins, using the scientific literature, that would not have been seen using data integration alone. Ondex is an open-source software project and can be downloaded, together with the text mining features described here, from www.ondex.org.
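    The abstract does not name the scoring metrics used. As one illustrative possibility, the sketch below scores protein-stress associations by pointwise mutual information (PMI) over document-level co-occurrence. The function name, the naive substring matching and the example abstracts are all assumptions made for illustration; they are not the method of the paper.

```python
import math


def pmi_scores(abstracts, proteins, stress_terms):
    """Score (protein, stress term) pairs by PMI over co-occurrence.

    PMI = log2( N * n(p, s) / (n(p) * n(s)) ), where N is the number of
    abstracts and n(.) counts the abstracts mentioning the term(s).
    Pairs that never co-occur are omitted.
    """
    docs = [a.lower() for a in abstracts]
    total = len(docs)

    def doc_freq(term):
        # Naive substring match; a real system would use proper tokenisation.
        return sum(term.lower() in d for d in docs)

    scores = {}
    for protein in proteins:
        for stress in stress_terms:
            both = sum(
                protein.lower() in d and stress.lower() in d for d in docs
            )
            if both == 0:
                continue
            scores[(protein, stress)] = math.log2(
                both * total / (doc_freq(protein) * doc_freq(stress))
            )
    return scores


# Placeholder abstracts; GeneX and GeneY are invented names.
abstracts = [
    "GeneX is induced under drought.",
    "GeneX expression rises during drought and heat.",
    "GeneY responds to cold.",
    "GeneY is unaffected by drought.",
]
scores = pmi_scores(abstracts, ["GeneX", "GeneY"], ["drought", "heat", "cold"])
ranked = sorted(scores, key=scores.get, reverse=True)
```

    A positive score means the pair co-occurs more often than independent mentions would predict; note that co-occurrence alone cannot distinguish a reported association from a reported absence of one, as the last placeholder abstract shows.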